Data Dependent Distance Metric for Efficient Gaussian Processes Classification
Abstract
The contributions of this work are threefold. First, various metric learning techniques are analyzed and systematically studied under a unified framework to highlight the criticality of a data-dependent distance metric in machine learning. The metric learning algorithms are categorized as naive, semi-naive, complete, and high-level metric learning under a common distance measurement framework. Second, the connection of feature selection, feature weighting, feature partitioning, kernel tuning, etc. with metric learning is discussed, and it is shown that they are all in fact forms of metric learning. Third, it is shown that the realm of metric learning is not limited to k-nearest neighbor (k-NN) classification, and that a metric optimized in the k-nearest neighbor setting is likely to be effective and applicable in other kernel-based frameworks, for example Support Vector Machine (SVM) and Gaussian Processes (GP) classifiers. We support our hypotheses by tuning the length-scale parameters of a GP with a metric learning method proposed in the k-NN framework. Our empirical results on a wide range of machine learning datasets suggest that a metric optimized in the framework of one learning algorithm is likely to be effective in those of others.
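The idea of reusing a metric learned for k-NN as the length-scale parameters of a GP kernel can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the per-feature weights are set here by a simple between-class/within-class variance ratio (a stand-in for the k-NN-driven metric optimization), the toy data are synthetic, and GP regression on the class labels is used as a common approximation to full GP classification.

```python
import numpy as np

# Toy 2-class data: feature 0 separates the classes, feature 1 is noise.
rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], [0.5, 2.0], size=(20, 2))
X1 = rng.normal([3.0, 0.0], [0.5, 2.0], size=(20, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 20 + [1] * 20)

def feature_weights(X, y):
    """One weight per feature: ratio of between-class to within-class
    scatter -- a simple proxy for a diagonal (semi-naive) learned metric."""
    overall_mean = X.mean(axis=0)
    within = np.zeros(X.shape[1])
    between = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
    return between / (within + 1e-12)

def ard_rbf(A, B, w):
    """RBF kernel with per-feature weights, i.e. the learned diagonal
    metric reused as inverse squared length-scales (ARD form)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2 * w).sum(axis=-1)
    return np.exp(-0.5 * d2)

w = feature_weights(X, y)

# GP posterior mean on the (centered) labels, with jitter for stability.
K = ard_rbf(X, X, w) + 1e-6 * np.eye(len(X))
alpha = np.linalg.solve(K, y - y.mean())

x_test = np.array([[3.0, 1.0], [0.0, -1.0]])
pred = ard_rbf(x_test, X, w) @ alpha + y.mean()
print(w)     # weight on the informative feature 0 dominates
print(pred)  # first test point leans toward class 1, second toward class 0
```

The point of the sketch is only the hand-off: whatever weights the metric learning stage produces are plugged directly into the kernel as length-scales, so the GP inherits the data-dependent geometry without retuning.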
Similar resources
An efficient weighted nearest neighbour classifier using vertical data representation
The k-nearest neighbour (KNN) technique is a simple yet effective method for classification. In this paper, we propose an efficient weighted nearest neighbour classification algorithm, called PINE, using vertical data representation. A metric called HOBBit is used as the distance metric. The PINE algorithm applies a Gaussian podium function to assign weights to different neighbours. We compare PIN...
An Information Geometry Approach for Distance Metric Learning
In this paper, we propose a framework for metric learning based on information geometry. The key idea is to construct two kernel matrices for the given training data: one is based on the distance metric and the other is based on the assigned class labels. Inspired by the idea of information geometry, we relate these two kernel matrices to two Gaussian distributions, and the difference between t...
Non-Euclidean metrics for similarity search in noisy datasets
In the context of classification, the dissimilarity between data elements is often measured by a metric defined on the data space. Often, the choice of metric is disregarded and the Euclidean distance is used without further inquiry. This paper illustrates the fact that when noise schemes other than white Gaussian noise are encountered, it can be interesting to use alternative m...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Breast cancer diagnosis using nonparametric probability density estimation based on kernel methods
Introduction: Breast cancer is the most common cancer in women. An accurate and reliable system for early diagnosis of benign or malignant tumors seems necessary. Using the results of FNA together with data mining and machine learning techniques, we can design new methods for early diagnosis of breast cancer that are able to detect breast cancer with high accuracy. Materials and Methods: In this study,...
Journal title:
Volume / Issue
Pages -
Publication date: 2014